Abstract

Many of the type 2 diabetes loci identified through genome-wide association studies localize to non-protein-coding intronic 
and intergenic regions and likely contain variants that regulate gene transcription. The CDC123/CAMK1D type 2 diabetes 
association signal on chromosome 10 spans an intergenic region between CDC123 and CAMK1D and also overlaps the 
CDC123 3'UTR. To gain insight into the molecular mechanisms underlying the association signal, we used open chromatin, 
histone modifications and transcription factor ChIP-seq data sets from type 2 diabetes-relevant cell types to identify SNPs 
overlapping predicted regulatory regions. Two regions containing type 2 diabetes-associated variants were tested for 
enhancer activity using luciferase reporter assays. One SNP, rs11257655, displayed allelic differences in transcriptional 
enhancer activity in 832/13 and MIN6 insulinoma cells as well as in human HepG2 hepatocellular carcinoma cells. The 
rs11257655 risk allele T showed greater transcriptional activity than the non-risk allele C in all cell types tested. Using 
electromobility shift and supershift assays, we demonstrated that the rs11257655 risk allele showed allele-specific binding to 
FOXA1 and FOXA2. We validated FOXA1 and FOXA2 enrichment at the rs11257655 risk allele using allele-specific ChIP in 
human islets. These results suggest that rs11257655 affects transcriptional activity through altered binding of a protein 
complex that includes FOXA1 and FOXA2, providing a potential molecular mechanism at this GWAS locus. 
Introduction

Type 2 diabetes is a complex metabolic disease with a 
substantial heritable component [1]. Over the past seven years, 
genome-wide association studies (GWAS) have successfully 
identified over 70 common risk variants associated with type 2 diabetes 
[2-5]. Association signals at many of these loci localize to 
non-protein-coding intronic and intergenic regions and likely harbor 
regulatory variants altering gene transcription. In recent years, 
great advances have facilitated identification of regulatory 
elements genome-wide using techniques including DNase-seq 
and FAIRE-seq (formaldehyde-assisted isolation of regulatory 
elements), which identify regions of nucleosome-depleted open 
chromatin, and ChIP-seq (chromatin immunoprecipitation), 
which identifies histone modifications to nucleosomes and 
transcription factor binding sites. Several studies have successfully 
integrated trait-associated variants at GWAS loci with publicly 
available regulatory element datasets in disease-relevant cell types 
to guide identification of regulatory variants underlying disease 
susceptibility [6-10]. 

The CDC123 (cell division cycle protein 123)/CAMK1D 
(calcium/calmodulin-dependent protein kinase 1D) locus on 
chromosome 10 contains common variants (MAF > .05) strongly 
associated with type 2 diabetes in Europeans (rs12779790, 
P = 1.2×10⁻¹⁰) [3], East Asians (rs10906115, P = 1.5×10⁻⁸) [4], 
and South Asians (rs11257622, P = 5.8×10⁻⁶) [5]. Fine-mapping 
using the Metabochip identified rs11257655 as the lead SNP [2]. 
The index variant and proxies (r² > .7) span an intergenic region of 
at least 45 kb between CDC123 and CAMK1D and overlap the 3' 
end of CDC123 [3]. None of the type 2 diabetes-associated 
variants at this locus are located in exons. Analysis of the beta cell 
function measurements HOMA-B and insulinogenic index, 
derived from paired glucose and insulin measures at fasting or 
30 minutes after a glucose challenge, demonstrated association of 
the risk allele at the CDC123/CAMK1D locus with reduced beta 
cell function, suggesting the beta cell as a candidate affected tissue 
[2,11]. Another intronic variant (rs7068966, r² = 0.18 EUR, 
1000G Phase 1) located 50 kb away from rs12779790 is associated 
with lung function [12]. 

Author Summary 

GWAS have identified more than 1200 loci contributing to 
risk of disease, including more than 70 loci associated with 
type 2 diabetes. With a majority of associated variants 
localized to non-coding regions of the genome, focus has 
moved to identifying the functional variants explaining the 
association signals. One mechanism by which variants may 
act is to affect activity of enhancer elements regulating 
target gene expression. In this study, we take advantage of 
recent advances in genome-wide annotation of human 
regulatory elements to prioritize candidate functional 
variants at the CDC123/CAMK1D locus. We identify two 
T2D-associated variants that overlap predicted regulatory 
enhancer elements. We demonstrate that one variant, 
rs11257655, shows allele-specific transcriptional enhancer 
activity in mammalian cell lines relevant to type 2 diabetes. 
We also show differential protein-DNA binding suggesting 
that the rs11257655 type 2 diabetes risk allele increased 
transcriptional activity through binding a protein complex 
that includes FOXA1 and FOXA2. This study demonstrates 
that genome-wide maps of regulatory elements are a 
useful resource to guide identification of variants 
differentially affecting transcriptional activity and provides 
insight into molecular mechanisms underlying a T2D 
susceptibility locus. 

The transcript(s) targeted by risk variant activity at this locus 
remain unknown. CDC123 is regulated by nutrient availability in 
yeast and is essential to the onset of mRNA translation and protein 
synthesis through assembly of the eukaryotic initiation factor 2 
complex [13,14]. Evidence from previous GWAS suggests 
cell cycle dysregulation as a common mechanism in type 2 
diabetes; for example, type 2 diabetes association signals are found 
close to the cell cycle regulator genes CDKN2A/CDKN2B and 
CDKAL1 [15]. CAMK1D is a member of the 
Ca²⁺/calmodulin-dependent protein kinase family, which transduces intracellular 
calcium signals to affect diverse cellular processes. Upon calcium 
influx in granulocyte cells and hippocampal neurons, CAMK1D 
activates CREB-dependent gene transcription [16,17]. Given the 
roles of cytosolic calcium in regulation of beta cell exocytotic 
machinery and of CREB in beta cell survival, CAMK1D may have 
a role in beta cell insulin secretion. In cis-eQTL analyses, the 
rs11257655 type 2 diabetes risk allele was more strongly and 
directly associated with increased expression of CAMK1D than 
CDC123 in both blood and lung [18,19]. 

In this study we aimed to identify the variant(s) underlying the 
association signal at the CDC123/CAMK1D locus using genome-wide 
maps of open chromatin, chromatin state and transcription 
factor binding in pancreatic islets, hepatocytes, adipocytes and 
skeletal muscle myotubes. We measured transcriptional activity of 
variants in putative regulatory elements using luciferase reporter 
assays, and identified a candidate cis-acting SNP driving 
allele-specific enhancer activity in two mammalian beta cell lines as well 
as hepatocellular carcinoma cells. We then evaluated DNA-protein 
binding in sequence surrounding this variant and 
identified allele-specific binding to key islet and hepatic transcription 
factors. Thus, our study provides strong evidence of a 
functional variant underlying the type 2 diabetes association signal 
at the CDC123/CAMK1D locus acting through altered regulation 
in type 2 diabetes-relevant cell types. 



Results 

Prioritization of type 2 diabetes-associated SNPs with 
regulatory potential at the CDC123/CAMK1D locus 

To identify potentially functional SNPs at the CDC123/CAMK1D 
locus, we considered variants in high LD (r² > .7, 
EUR, 1000G Phase 1 release) with GWAS index SNP 
rs12779790. To further prioritize variants for functional follow-up, 
we used genome-wide maps of chromatin state (Figure 1) in 
available type 2 diabetes-relevant cell types including pancreatic 
islets, liver hepatocytes, skeletal muscle myotubes and adipose 
nuclei. Variant position was evaluated with respect to DNase- and 
FAIRE-seq peaks and several histone modifications, including 
H3K4me1 and H3K9ac. DNase and FAIRE are established 
methods of identification of nucleosome-depleted regulatory 
regions [20], while H3K4me1 and H3K9ac are post-translational 
chromatin marks often associated with enhancer regions [21,22]. 
We also assessed chromatin occupancy by transcription factors 
using available genome-wide ChIP-seq data sets. Of 11 variants 
meeting the LD threshold, two SNPs were found to overlap 
chromatin signals. One SNP, rs11257655 (r² = .74 with GWAS 
index SNP rs12779790), located 15 kb from the 3' end of 
CDC123 and 84 kb from the 5' end of CAMK1D, was a 
particularly plausible candidate overlapping islet, liver and HepG2 
cell line DNase peaks, islet and liver FAIRE peaks, H3K4me1 and 
H3K9ac chromatin marks, and FOXA1 and FOXA2 ChIP-seq 
peaks in HepG2 cells (Figure S1). A second SNP, rs34428576 
(r² = .71 with rs12779790), overlapped a HepG2 DNase peak and 
displayed occupancy by FOXA1 and FOXA2 in HepG2 
cells (Figure 1). No SNPs overlapped with DNase peaks in skeletal 
muscle myotubes. 
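
The prioritization logic described in this section (an LD threshold followed by overlap with regulatory annotation tracks) can be sketched in a few lines of Python. This is an illustrative sketch only, not the pipeline used in the study: the variant names, positions, r² values and peak intervals are hypothetical placeholders, and `Variant`, `overlaps` and `prioritize` are names introduced here.

```python
from typing import NamedTuple

class Variant(NamedTuple):
    name: str    # placeholder identifier (hypothetical)
    pos: int     # position on the chromosome (hypothetical)
    r2: float    # LD (r^2) with the GWAS index SNP (hypothetical)

def overlaps(pos, peaks):
    """Return True if pos falls inside any half-open (start, end) peak interval."""
    return any(start <= pos < end for start, end in peaks)

def prioritize(variants, peak_sets, r2_min=0.7):
    """Keep variants that pass the LD threshold and overlap at least one track."""
    selected = {}
    for v in variants:
        if v.r2 < r2_min:  # assumption: the threshold is treated as inclusive
            continue
        tracks = [t for t, peaks in peak_sets.items() if overlaps(v.pos, peaks)]
        if tracks:
            selected[v.name] = tracks
    return selected

# Toy inputs; none of these values come from the study.
variants = [
    Variant("snpA", 1_050, 0.74),
    Variant("snpB", 2_500, 0.71),
    Variant("snpC", 3_900, 0.95),  # high LD but no overlapping peak
    Variant("snpD", 1_060, 0.40),  # overlaps a peak but fails the LD threshold
]
peak_sets = {
    "islet_DNase": [(1_000, 1_200)],
    "HepG2_DNase": [(1_020, 1_100), (2_400, 2_600)],
}

prioritized = prioritize(variants, peak_sets)
print(prioritized)
```

With real data the peak intervals would come from annotation files for each cell type, and the list of overlapping tracks per variant gives a simple way to rank candidates for functional follow-up.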

Allele-specific enhancer activity of rs11257655 in islet 
and liver cells 

To evaluate transcriptional activity of the SNPs in predicted 
regulatory regions, 150-200 bp surrounding each SNP allele was 
cloned into a minimal promoter vector and luciferase activity was 
measured in two beta cell lines, 832/13 rat insulinoma and MIN6 
mouse insulinoma cells, and in HepG2 liver hepatocellular 
carcinoma cells. Four to five independent clones for each allele 
were generated and enhancer activity was measured in duplicate 
for each clone. A 151-bp region including rs11257655 (and 
rs36062557 due to proximity, r² = .38 with rs11257655) showed 
differential allelic enhancer activity in both orientations in all three 
cell lines (Figure 2). The risk allele rs11257655-T showed 
significantly increased luciferase activity compared to the non-risk 
allele rs11257655-C (forward: 832/13 P = 6.3×10⁻³, MIN6 
P = 1.7×10⁻⁵, HepG2 P = 8.0×10⁻⁵; reverse: 832/13 
P = 2.2×10⁻³, MIN6 P = 9.9×10⁻⁵, HepG2 P = 2.0×10⁻³). 
This difference represents a greater than 1.4-fold (HepG2, 
MIN6) to 2.1-fold (832/13) increase in transcriptional activity 
relative to the non-risk allele in both the forward and reverse 
orientations. Compared to an empty vector control, enhancer 
activity was greatest in the islet cell lines (risk allele: 832/13, 4-fold; 
MIN6, 10-fold; HepG2, 1.6-fold). 

A 179-bp region surrounding the second candidate SNP 
rs34428576 showed only moderate allele-specific activity, and 
only in the reverse orientation, in HepG2 cells (P = .02) and no 
allele-specific activity in islet cells (Figure S2). 

To verify that rs11257655 and not rs36062557 accounted for 
allele-specific effects, we used site-directed mutagenesis to 
construct the remaining haplotype combinations. The T risk 
allele of rs11257655 exhibited >1.8-fold increased transcriptional 
activity compared to the non-risk allele C independent of 
rs36062557 genotype (Figure 3A, B). In contrast, altering alleles of 
rs36062557 on a consistent rs11257655 background showed no 
significant effect on transcriptional activity. Taken together, these 
data confirm that rs11257655 exhibits allelic differences in 
transcriptional enhancer activity and suggest it functions within 
a cis-regulatory element at the CDC123/CAMK1D type 2 
diabetes-associated locus. 

Figure 1. Regulatory potential at type 2 diabetes-associated SNPs at the CDC123/CAMK1D locus. A) The 11 SNPs in high LD (r² > .7, EUR) 
with GWAS index SNP rs12779790. Arrows indicate the two SNPs that overlap islet, liver, and HepG2 open chromatin and epigenomic marks and that 
are located near to HepG2 ChIP-seq peaks; these two SNPs were tested for allele-specific transcriptional activity. B) DNase hypersensitivity peaks 
identified in two pooled islet samples from the ENCODE Consortium. C) FAIRE peaks identified in one representative islet sample from the ENCODE 
Consortium. D) H3K4me1 histone modifications from the Roadmap Epigenomics Consortium. E) FOXA1 and FOXA2 ChIP-seq peaks and signal from 
ENCODE. Image is taken from the UCSC genome browser, February 2009 (GRCh37/hg19) assembly (http://genome.ucsc.edu) [51]. The 5' end of 
CAMK1D begins after position 12,390,000. 
doi:10.1371/journal.pgen.1004633.g001 

Alleles of rs11257655 differentially bind FOX transcription 
factors 

To assess whether alleles of rs11257655 differentially affect 
protein-DNA binding in vitro, biotin-labeled probes surrounding 
the T (risk) or C (non-risk) allele were incubated with 832/13, 
MIN6 or HepG2 nuclear lysate and subjected to electrophoretic 
mobility shift assays (EMSA). Band shifts indicative of multiple 
DNA-protein complexes were observed for both rs11257655 
alleles (Figure 4A, 4B, 4C). In EMSAs from all three cell nuclear 
extracts, protein complexes were observed for the probe containing 
the T allele that were not present for the probe containing the 
C allele (832/13, arrow a; MIN6, arrows b, c, d; HepG2, arrows e, 
f), suggesting differential protein binding dependent on the 
rs11257655 allele. Excess unlabeled T-allele probe competed away 
allele-specific bands more efficiently than excess unlabeled C-allele probe, 
demonstrating allele-specificity of the protein-DNA complexes 
(Figure 4A, 4B, 4C). rs11257655 did not show a differential 
protein binding pattern in EMSA using 3T3-L1 mouse adipocytes. 
To examine transcription factor binding to rs11257655, we used a 
DNA-affinity capture assay. We observed one protein band 
showing allele-specific binding to the T allele (Figure 4D) that 
was identified as transcription factor FOXA2 using MALDI 
TOF/TOF mass spectrometry. 

A search in the JASPAR CORE database provided further 
evidence that the rs11257655 SNP is located within predicted 
binding sites for FOXA1 and FOXA2, with only the T risk allele 
predicted to contain a FOXA1 and FOXA2 consensus core-binding 
motif (Figure 4E) [23]. To assess binding to FOXA1 and 
FOXA2, we performed supershift experiments incubating DNA-protein 
complexes with antibodies for these factors. Incubation of 
the T allele-protein complex with FOXA1 antibody resulted in a 
band supershift in 832/13 and HepG2 cells (asterisk, Figure 4A, 
4C). A FOXA2-mediated supershift was observed in 832/13, 
MIN6 and HepG2 cells (asterisk, Figure 4A, 4B, 4C). Differences 
in antibody species reactivity may account for the lack of a visible 
FOXA1-mediated supershift in MIN6 cells. Collectively, these 
results suggest that rs11257655 is located in binding sites for a 
transcriptional regulator complex including FOXA1 and/or 
FOXA2, which bind preferentially to the rs11257655-T allele in 
beta cell and liver cell lines. 
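
As a rough illustration of why only the T allele is predicted to carry a forkhead site, the 21-bp EMSA probe sequences given in Materials and Methods can be scanned for a forkhead core motif. This sketch assumes the commonly cited core TGTTTAC (reverse complement GTAAACA) as a simplified stand-in. It is an exact-match check on both strands, not the JASPAR matrix search reported in this section, and the function names are introduced here for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
FORKHEAD_CORE = "TGTTTAC"  # assumed simplified core motif, not a JASPAR matrix

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def has_forkhead_core(seq):
    """True if the core motif occurs on either strand of seq."""
    return FORKHEAD_CORE in seq or FORKHEAD_CORE in reverse_complement(seq)

# Sense-strand probe from Materials and Methods with the SNP position templated.
PROBE = "GGGCAAGTGT{allele}TACTGGGCAT"

calls = {allele: has_forkhead_core(PROBE.format(allele=allele)) for allele in "CT"}
print(calls)  # {'C': False, 'T': True}
```

An exact-match scan is deliberately crude: a position weight matrix would instead give each allele a graded score, which is closer to how binding-site predictions are normally made.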

FOXA1 and FOXA2 occupancy at rs11257655 in human 
islets 

To evaluate whether FOXA1 and FOXA2 bind differentially to 
rs11257655 in a native chromatin context, we performed allele-specific 
ChIP in human islets with different rs11257655 genotypes. 
FOXA1 was enriched 7.2-fold compared to IgG control in islets 
carrying a T allele while FOXA1 was not enriched in islets 
homozygous for the C allele (Figure 5A). Although less robust, 
FOXA2 was enriched 4.2-fold in islets carrying a T allele 
compared to IgG control (Figure 5B). This direction of enrichment 
is consistent with the EMSA data (Figure 4). A region 28 kb 
downstream of rs11257655 with no evidence of open chromatin 
(chr10 control) was used as a negative control (Figure S3). These 
findings strengthen the conclusion that rs11257655 is part of a 
bona fide cis-regulatory complex binding FOXA1 and/or FOXA2 
in human islets. 

Figure 2. Haplotype containing type 2 diabetes-associated 
SNPs displays differential transcriptional activity. Enhancer 
activity was tested in 832/13, MIN6 and HepG2 cells for the type 2 
diabetes non-risk (white bars) and risk (black bars) haplotypes in the 
forward and reverse orientations with respect to the genome. Risk 
refers to the rs11257655 variant; rs36062557 is included in the 
haplotype due to proximity. The haplotype containing risk allele 
rs11257655-T shows greater transcriptional activity than the non-risk 
allele rs11257655-C in both orientations with respect to a minimal 
promoter vector in 832/13 cells (A), MIN6 cells (B) and HepG2 cells (C). 
Error bars represent standard deviation of 4-5 independent clones for 
each allele. Firefly luciferase activity was normalized to Renilla luciferase 
activity, and normalized results are expressed as fold change compared 
to empty vector control. P values were calculated by a two-sided t-test. 
doi:10.1371/journal.pgen.1004633.g002 

CDC123 and CAMK1D transcript levels 

To determine whether CDC123 or CAMK1D are expressed in 
type 2 diabetes-relevant tissues, we measured and confirmed 
expression of both transcripts in human islets and hepatocytes 
(Figure S4A, S4B). These data are supported by RNA-seq 
evidence that both genes are expressed in islets [24]. Based on 
our results showing islet beta cells as a target tissue of risk variant 
regulatory activity, we assessed whether glucose treatment 
regulated CDC123 and CAMK1D transcript levels. Glucose-mediated 
transcriptional changes in one of these genes might 
point to the more plausible candidate important in beta cell 
biology. In MIN6 cells treated with low (3 mM) and high (20 mM) 
concentrations of glucose for 16 hours, CAMK1D expression 
increased (P = .004; Figure S4C) while CDC123 expression 
remained unchanged (P = .22; Figure S4D). In 832/13 cells, 
CDC123 levels were significantly higher in cells stimulated with 
high glucose (P= 1.6x10 ; Figure S4E). We could not assess the 
effect of glucose on CAMK1D levels in 832/13 cells because this 
transcript level was below detection limits. While we confirm 
expression of CAMK1D and CDC123 in islets and hepatocytes, 
future studies over-expressing the target gene(s) in these tissues 
would be necessary to establish the mechanisms by which 
increased expression leads to diabetes risk. 

Figure 3. rs11257655 drives differential transcriptional activity. 
Site-directed mutagenesis was carried out to separate the effects of 
rs36062557 from rs11257655. Enhancer activity was tested in 832/13 
and MIN6 cells for the type 2 diabetes non-risk (white bars) and risk 
(black bars) haplotypes in the forward orientation. The risk allele 
rs11257655-T shows greater transcriptional activity than the non-risk 
allele rs11257655-C independent of rs36062557 genotype in 832/13 
cells (A) and MIN6 cells (B). Error bars represent standard deviation of 
2-4 independent clones for each allele. Results are expressed as fold 
change compared to empty vector control. P values were calculated by 
a two-sided t-test. 
doi:10.1371/journal.pgen.1004633.g003 

Discussion 

Integration of genome-wide regulatory annotation maps with 
disease-associated variants identified through GWAS has great 
potential for elucidation of gene-regulatory variants underlying 
association signals. In this study, we expand the lexicon of 
disease-associated functional regulatory variation by examining the type 2 
diabetes association signal at the CDC123/CAMK1D locus. We 
prioritized candidate cis-regulatory variants and tested whether 
prioritized variants exhibited allele-specific transcriptional 
enhancer activity. We provide transcriptional reporter and 
protein-DNA binding evidence that rs11257655 is part of a 
cis-regulatory complex differentially affecting transcriptional activity. 
Additionally, we validate FOXA1 and FOXA2 as components of 
this regulatory complex in human islets. 

In recent years, progress has been made in mechanistic 
follow-up studies of GWAS type 2 diabetes association signals 
[6,7,9,25-30], but challenges remain in sifting through the many 
associated variants at a locus to identify those influencing disease. 
We hypothesized that a common variant with modest effect 
underlies the association at the CDC123/CAMK1D locus and 
evaluated the location of high LD variants (r² ≥ .7; n = 11) at the 
locus relative to known transcripts and to putative DNA regulatory 
elements. We identified two variants that overlapped putative islet 
and/or liver regulatory regions and none located in exons. We did 
not assess variants in lower LD (r² < .7), and additional functional 
SNPs may exist at this locus acting through alternate functional 
mechanisms untested in the current study. 

Based on our observation of type 2 diabetes-associated SNPs in 
regions of islet and liver open chromatin, we measured transcriptional 
activity in two mammalian islet cell models, rat 832/13 and 
mouse MIN6 insulinoma cells, and in one hepatocyte cell model, 
human HepG2 hepatocellular carcinoma cells. In agreement with 
our previous observations [7], we found good concordance in 
allelic transcriptional activity of human regulatory elements across 
the two rodent islet cell types. Of the two SNPs located in 
predicted enhancer regions, rs11257655 but not 
rs36062557 demonstrated allele-specific effects in islets and liver, 
suggesting that rs11257655 is a lead functional candidate. The 
rs11257655-T allele associated with type 2 diabetes risk displayed 
increased enhancer activity relative to the C allele, suggesting that 
increased expression of one or more genes, possibly CAMK1D or 
CDC123, may be associated with type 2 diabetes. Our subsequent 
analysis of protein binding revealed complexes that favored the 
rs11257655-T allele in 832/13, MIN6 and HepG2 cells. 
Consistent with predictions that the rs11257655-C allele may 
disrupt binding to the FOXA1 and FOXA2 transcription factors, 
we demonstrated that only the T allele of rs11257655 leads to 
FOXA1- and FOXA2-mediated supershifts. The ChIP enrichment 
of FOXA1 and FOXA2 in human islets from carriers of the 
T allele is concordant with EMSAs using nuclear extract from 
mouse and rat cell lines, further demonstrating the utility of rodent 
islet cell models to characterize human regulatory elements. Our 
results suggest that a cis-regulatory element surrounding 
rs11257655 may act in both islet and liver cells. Although we 
provide evidence that rs11257655 alleles differentially bind 
FOXA1 and FOXA2 in vivo, it is important to note that this 
enrichment was detected in isolated human islets. Future 
experiments will be needed to validate effects of rs11257655 
within a whole organism environment. For example, zebrafish 
have recently been used to assay the regulatory potential of DNA 
sequences [31,32]. 

Figure 5. rs11257655-T allele shows increased binding to 
FOXA1 and FOXA2 in human islets. FOXA1 (A) and FOXA2 (B) ChIP 
in human islets shows enrichment at rs11257655 compared to IgG 
control. Islets containing one copy of the rs11257655-T allele show 7.2- 
fold greater FOXA1 enrichment and 4.2-fold greater FOXA2 enrichment. 
rs11257655 CT heterozygotes are more significantly enriched than 
rs11257655 CC homozygotes for FOXA1 (one-sided t-test, P = .06) and 
FOXA2 (one-sided t-test, P = .026). A negative control region 28 kb 
downstream of rs11257655 was not enriched in FOXA1- and FOXA2- 
bound chromatin (Figure S3A and S3B). Error bars represent standard 
error of two to three islets for each represented genotype. 
doi:10.1371/journal.pgen.1004633.g005 

FOXA1 and FOXA2 are members of the FOXA subclass of the 
forkhead box transcription factor family and are essential 
transcriptional activators in development of endodermally-derived 
tissues including liver and pancreas [33,34]. In mature mouse 
β-cells, ablation of both transcription factors compared to ablation of 
FoxA2 alone leads to more pronounced impairment of glucose 
homeostasis and insulin secretion, indicating that both factors 
are important in maintenance of the mature beta cell phenotype 
[35]. In addition, FoxA2 integrates the transcriptional response of 
mouse adult hepatocytes to a state of fasting [36]. FOXA1 and 
FOXA2 are thought to act as pioneer transcription factors, 
scanning chromatin for enhancers with forkhead motifs and 
opening compacted chromatin through DNA demethylation and 
subsequent induction of H3K4 methylation, epigenetic changes 
that likely render enhancers transcriptionally competent by 
allowing subsequent recruitment of transcriptional effectors 
[37-39]. Our data demonstrate increased transcriptional activity and 
increased binding of FOXA1 and FOXA2 to the rs11257655-T 
allele, suggesting that rs11257655 may be functioning as part of a 
transcriptional activator complex. Recent experiments in 
pancreatic islets support a role for FOXA transcription factors in 
activation of islet enhancers [40]. This same study also showed 
that FOXA2 binds in pancreatic islets in the T2D-associated 
region surrounding rs11257655. Further experiments, such as 
ChIP-seq of additional transcription factors, may identify other 
key factors present in the activator complex. 

Both CAMK1D and CDC123 are candidate transcripts affected 
by variation at this locus. Cis-eQTLs in both blood and lung 
support an effect on CAMK1D but not CDC123. In blood, initial 
eQTL evidence for both genes was further analyzed by 
conditional analyses on the T2D lead SNP or rs11257655. The 
conditional analyses abolished the cis-eQTL signal for CAMK1D 
but not for CDC123, providing evidence that the T2D GWAS 
signal and the CAMK1D cis-eQTL signal are coincident [18]. In 
lung, the GTEx consortium identified an eQTL for CAMK1D 
with rs11257655 as a lead associated variant (P = l.lxlO - ); this 
and other T2D GWAS variants are the strongest cis-eQTLs for 
CAMK1D, while no significant eQTL is observed for CDC123 
[19]. For both eQTLs, the rs11257655 type 2 diabetes risk allele is 
associated with increased CAMK1D transcript level, consistent 
with the direction of transcriptional activity we observed for this 
allele in islet and liver cells. Many eQTLs are predicted to be 
shared among tissues [41], and a recent study of the beta cell 
transcriptome reports good concordance of eQTL direction 
(R² = .74-.76) between beta cells and blood-derived lymphoblastoid 
cell lines, fat and skin [42], suggesting that the CAMK1D eQTL 
may also exist in islets. Some eQTLs differ across tissues, and 
evidence of a consistent eQTL in islets would be valuable. Knockout 
mice provide further evidence supporting CAMK1D as a target 
gene. In FoxA1/FoxA2 beta cell-specific knockout mice, Camk1d 
expression was reported to be slightly reduced (1.8-fold, P = 0.13) 
[35], consistent with our conclusion that rs11257655 is part of a 
transcriptional activator complex that includes FOXA1 and 
FOXA2. Together, these data suggest that CAMK1D is a more 
plausible target for differential regulation by rs11257655 alleles. 

The mechanism by which CAMK1D may act in type 2 diabetes 
biology is unclear. CAMK1D is a serine/threonine kinase that 
operates in the calcium-triggered CaMKK-CaMK1 signaling 
cascade [17,43]. In response to calcium influx, CAMK1D activates 
CREB- (cAMP response element-binding protein) dependent gene 
transcription by phosphorylation [17]. CREB is a key beta cell 
regulator important in glucose sensing, insulin exocytosis, gene 
transcription and β-cell survival [44], and FOXA2 has been shown 
to be necessary to mediate recruitment of CREB in fasting-induced 
activation of hepatic gluconeogenesis [36]. CAMK1D also has been 
reported to regulate glucose in primary human hepatocytes [45]. It 
is important to note that we cannot rule out cell cycle regulator 
CDC123 as a target for regulation by rs11257655. 

In conclusion, we extend follow up studies of GWAS-identified 
type 2 diabetes-associated variants to the CDC123/CAMK1D 
locus on chromosome 10. We identify rs11257655 as part of a 
cis-regulatory complex in islet and liver cells that alters transcriptional 
activity through binding FOXA1 and FOXA2. These data 
demonstrate the utility of experimentally predicted chromatin 
state to identify regulatory variants for complex traits. 

Materials and Methods 

Selection of SNPs for functional study 

Variants were prioritized for functional study based on linkage 
disequilibrium (LD) and evidence of being in an islet or liver 
regulatory element based on data from the ENCODE consortium 
[46]. Of 11 variants meeting the LD threshold (r² ≥ .7, EUR, with 
the GWAS index SNP rs12779790, 1000G Phase 1 release), two 
SNPs showed evidence of open chromatin [6,9,20,47], histone 
modifications [21,22,48] or transcription factor binding and were 
tested for evidence of differential transcriptional activity. 

Cell culture 

Two insulinoma cell lines, rat-derived 832/13 [49] (C.B. 
Newgard, Duke University) and mouse-derived MIN6 [50], were 
maintained at 37°C with 5% CO₂. 832/13 cells were cultured in 
RPMI 1640 (Cellgro/Corning) supplemented with 10% FBS, 
1 mM sodium pyruvate, 2 mM L-glutamine, 10 mM HEPES and 
0.05 mM β-mercaptoethanol. MIN6 cells were cultured in 
DMEM (Sigma) supplemented with 10% FBS, 1 mM sodium 
pyruvate and 0.1 mM β-mercaptoethanol. HepG2 hepatocellular 
carcinoma cells were cultured in MEM-alpha (Gibco) 
supplemented with 10% FBS, 1 mM sodium pyruvate and 2 mM 
L-glutamine. 

Generation of luciferase reporter constructs, transient 
DNA transfection and luciferase reporter assays 

Fragments surrounding each of rs11257655 (151 bp) and 
rs34428576 (179 bp) were PCR-amplified (Table S1) from DNA 
of individuals homozygous for risk and non-risk alleles. Restriction 
sites for KpnI and XhoI were added to primers during 
amplification, and the resulting PCR products were digested with 
KpnI and XhoI and cloned in both orientations into the multiple 
cloning site of the minimal promoter-containing firefly luciferase 
reporter vector pGL4.23 (Promega, Madison, WI). Fragments are 
designated as 'forward' or 'reverse' based on their orientation with 
respect to the genome. Two to five independent clones for each 
allele for each orientation were isolated, verified by sequencing, 
and transfected in duplicate into 832/13, MIN6 and HepG2 cell 
lines. Missing haplotypes of rs36062557-rs11257655 constructs 
were created using the QuikChange site-directed mutagenesis kit 
(Stratagene). 

Approximately 1 xlO 5 cells per well were seeded in 24-well 
plates. At 80% confluency, cells were co-transfected with 
luciferase constructs and Renilla control reporter vector 
(phRL-TK, Promega) at a ratio of 10:1 using Lipofectamine 
2000 (Invitrogen) for 832/13, and using FUGENE-6 for MIN6 
and HepG2 cells (Roche Diagnostics, Indianapolis, IN). 48 h 
after transfection, cells were lysed with passive lysis buffer 
(Promega), and luciferase activity was measured using the Dual- 
luciferase assay system (Promega). To control for transfection 
efficiency, raw values for firefly luciferase activity were divided 
by raw Renilla luciferase activity values, and fold change was 
calculated as normalized luciferase values divided by pGL4.23 
minimal promoter empty vector control values. Data are 
reported as the fold change in mean (± SD) relative luciferase 
activity per allele. A two-sided i-test was used to compare 
luciferase activity between alleles. All experiments were carried 
out on a second independent day and yielded comparable allele- 
specific results. 
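
The normalization described above can be sketched in a few lines of Python. This is only an illustration: the readings are hypothetical, the function names are ours, dividing by the mean of the empty-vector wells is an assumption about how the control values were applied, and the pooled-variance t statistic is one common form of the two-sided t-test rather than a confirmed detail of the analysis.

```python
from math import sqrt
from statistics import mean, stdev


def relative_activity(firefly, renilla):
    """Normalize each well: raw firefly signal divided by raw Renilla signal."""
    return [f / r for f, r in zip(firefly, renilla)]


def fold_change(construct_ratios, empty_vector_ratios):
    """Express normalized values relative to the empty pGL4.23 vector.

    Assumption: the mean of the empty-vector wells is used as the baseline.
    """
    baseline = mean(empty_vector_ratios)
    return [x / baseline for x in construct_ratios]


def student_t(a, b):
    """Two-sample t statistic with pooled variance (equal-variance assumption)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


# Hypothetical raw luminescence readings, NOT data from this study.
empty = relative_activity([1000, 1100, 900], [500, 550, 450])
risk_t = relative_activity([6000, 6600, 5400], [500, 500, 500])
nonrisk_c = relative_activity([3000, 3300, 2700], [500, 500, 500])

fc_t = fold_change(risk_t, empty)
fc_c = fold_change(nonrisk_c, empty)

print(f"T allele: {mean(fc_t):.2f} ± {stdev(fc_t):.2f} fold over empty vector")
print(f"C allele: {mean(fc_c):.2f} ± {stdev(fc_c):.2f} fold over empty vector")
print(f"t statistic (T vs C): {student_t(fc_t, fc_c):.2f}")
```

Converting the t statistic to a P value requires the t distribution with n1 + n2 - 2 degrees of freedom, which a statistics package would normally supply.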

Electrophoretic mobility shift assay (EMSA) 

Nuclear cell extracts were prepared from 832/13, MIN6, and 
HepG2 cells using the NE-PER nuclear and cytoplasmic 
extraction kit (Thermo Scientific) according to the manufacturer's 
instructions. Protein concentration was measured with a BCA 
protein assay (Thermo Scientific), and lysates were stored at
−80°C until use. 21 bp oligonucleotides were designed to the
sequence surrounding rs11257655 risk or non-risk alleles: sense 5'
biotin-GGGCAAGTGT[C/T]TACTGGGCAT 3', antisense 5'
biotin-ATGCCCAGTA[G/A]ACACTTGCCC 3' (SNP alleles in
brackets). Double-stranded oligonucleotides for the risk and non-risk
alleles were generated by incubating 50 pmol complementary
oligonucleotides at 95°C for 5 minutes followed by gradual cooling
to room temperature. EMSAs were carried out using the
LightShift Chemiluminescent EMSA Kit (Thermo Scientific).
Binding reactions were set up as follows: 1× binding buffer,
50 ng/µL poly(dI·dC), 3 µg nuclear extract, 200 fmol of labeled
probe in a final volume of 20 µL. For competition reactions, a 67-fold
excess of unlabeled double-stranded oligonucleotides for
either the risk or non-risk allele was included. Reactions were
incubated at room temperature for 25 minutes. For supershift 
assays, 4 µg of polyclonal antibodies against FOXA1 (ab23738;
Abcam) or FOXA2 (SC6554X; Santa Cruz Biotechnology) was
added to the binding reaction and incubation proceeded for a
further 25 minutes. Binding reactions were subjected to non-
denaturing PAGE on DNA retardation gels in 0.5× TBE (Lonza),
transferred to Biodyne nylon membranes (Thermo Scientific) and
cross-linked on a UV-light cross linker (Stratagene). Biotin-labeled
DNA-protein complexes were detected by chemiluminescence.
EMSAs were carried out on a second independent day and yielded
comparable results.
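
As a quick consistency check on the probe design, the antisense oligonucleotide for each allele should be the reverse complement of the corresponding sense strand. The short Python sketch below (helper names are ours and purely illustrative) derives both antisense sequences from the sense sequence given above and also works out the amount of unlabeled competitor implied by a 67-fold excess over 200 fmol of labeled probe.

```python
# Watson-Crick complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def probe(allele, strand="sense"):
    """Build the 21-bp probe for a given rs11257655 allele (C or T)."""
    sense = "GGGCAAGTGT" + allele + "TACTGGGCAT"
    return sense if strand == "sense" else reverse_complement(sense)


for allele in "CT":
    print(allele, probe(allele), probe(allele, "antisense"))

# Unlabeled competitor needed for a 67-fold excess over the labeled probe.
labeled_fmol = 200
competitor_pmol = 67 * labeled_fmol / 1000
print(f"competitor per reaction: {competitor_pmol} pmol")
```

The derived antisense strands carry G opposite the C allele and A opposite the T allele, in agreement with the [G/A] position of the antisense oligonucleotide.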

DNA affinity capture assay

DNA affinity capture was carried out as previously described 
[7]. Briefly, dialyzed nuclear extracts (300 µg) were pre-cleared
with 100 µL of streptavidin-agarose dynabeads (Invitrogen) coupled
to biotin-labeled scrambled control oligonucleotides. For
DNA-protein binding reactions, 40 pmol of biotin-labeled probe
for either rs11257655 allele (same probe as for EMSA) or for a
scrambled control were incubated with 300 µg nuclear extract,
binding buffer (10 mM Tris, 50 mM KCl, 1 mM DTT),
0.055 µg/µL poly(dI·dC) and H2O to a total of 450 µL at room
temperature for 30 minutes with rotation. 100 µL (1 mg) of
streptavidin-agarose dynabeads were added and the reaction
incubated for a further 20 minutes. Beads were washed and DNA-
bound proteins were eluted in 1× reducing sample buffer
(Invitrogen). Proteins were separated on NuPAGE denaturing 
gels and protein bands stained with SYPRO-Ruby. Protein bands 
displaying differential binding between rs11257655 alleles were
excised from the gel and subjected to matrix-assisted laser
desorption time-of-flight/time-of-flight tandem mass spectrometry
(MS) and analysis at the University of North Carolina proteomics
core facility. For peptide identification, all MS/MS spectra were
searched against all entries in the NCBI non-redundant (NR) database
using GPS Explorer Software Version 3.6 (ABI) and the Mascot 
(MatrixScience) search algorithm. Mass tolerances of 80 ppm for 
precursor ions and 0.6 Da for fragment ions were used. In 
addition, two missed cleavages were allowed and oxidation of 
methionine was a variable modification. 

Chromatin Immunoprecipitation (ChIP) assays 

Human islets from non-diabetic organ donors were provided by 
the National Disease Research Interchange (NDRI). Use of 
human tissues was approved by the University of North Carolina 
Institutional Review Board. Islet viability and purity were assessed 
by the NDRI. Islets were warmed to 37°C and washed with 
calcium- and magnesium-free Dulbecco's phosphate-buffered 
saline (Life Technologies) prior to crosslinking. For chromatin 
immunoprecipitation (ChIP) studies, approximately 2000 islet 
equivalents (IEQs) were crosslinked for 10 min in 1% formalde- 
hyde (Sigma-Aldrich) at room temperature. Islets were lysed and 
chromatin was sheared on ice using a standard bioruptor 
(Diagenode; 20-22 cycles of 30 s sonication with 1 min rest 
between cycles) to a size of 200-1000 bp. IP dilution buffer (0.01% 
SDS, 1.1% Triton X-100, 1.2 mM EDTA, 16.7 mM Tris at 
pH 8.1, 167 mM NaCl, protease inhibitors) was added, 5% of the 
volume was removed and used as input, and the remainder was 
incubated overnight at 4°C on a nutating platform with FOXA1 
or FOXA2 antibody or a species-matched IgG as control. 
Antibodies used for ChIP were the same as for EMSA: FOXA1
(Abcam) and FOXA2 (Santa Cruz). Protein A agarose beads
(Santa Cruz) were added and incubated for 3 h at 4°C. Beads were 
then washed for 5 minutes at 4°C with gentle mixing, using the 
following solutions: Low Salt Buffer (0.1% SDS, 1% Triton X-100,
2 mM EDTA, 20 mM Tris, 150 mM NaCl); High Salt Buffer
(0.1% SDS, 1% Triton X-100, 2 mM EDTA, 20 mM Tris, 
500 mM NaCl); LiCl buffer (1 mM EDTA, 10 mM Tris, 250 mM 
LiCl, 1% NP-40, 1% Na-Deoxycholate), twice; and TE buffer 
(Sigma-Aldrich), twice. Chromatin was eluted from beads with two 
15-minute washes at 65°C using freshly prepared Elution Buffer 
(1% SDS/0.1 M NaHCO3). To reverse crosslinks, 5 M NaCl was
added to each sample to a final concentration of 0.2 M, and
samples were incubated overnight at 65°C; to remove protein, samples were
incubated with 10 µL 0.5 M EDTA, 20 µL 1 M Tris (pH 6.5) and
3 µL of Proteinase K (10 mg/mL) at 45°C for 3 hours. DNA was
extracted with 25:24:1 phenol:chloroform:isoamyl alcohol,
precipitated with 100% ethanol with 1 µL glycogen as a carrier,
and resuspended in TE (Sigma). qPCR was performed in triplicate
using SYBR Green Master Mix. Primers were designed to amplify
a 99-bp region surrounding rs11257655: 5'-CTACTGCTTCTCCGGACTCG-3'
and 5'-TGGCCTCAAGAGGGAGATAA-3'. Primers for a 133-bp control
region not overlapping open chromatin and located 27 kb away were
5'-GCACCCATGGTACTGAAACC-3' and 5'-CTTTTCCCGAGGAAGGAACT-3'.
Dissociation curves demonstrated a single
PCR product in each case without primer dimers. Fold
enrichment was calculated as FOXA1/FOXA2 enrichment 
divided by IgG control. A one-sided t-test was performed to 
compare enrichment based on the direction of binding observed 
using EMSA. 
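
One common way to compute such a fold enrichment from qPCR threshold cycles is sketched below in Python. It is an assumption that enrichment was first expressed as percent of input; the text states only that 5% of the chromatin was kept as input and that antibody enrichment was divided by the IgG control. The sketch also assumes perfect doubling per cycle, and the Ct values and function names are hypothetical.

```python
from math import log2

INPUT_FRACTION = 0.05  # 5% of the diluted chromatin was set aside as input


def percent_input(ct_ip, ct_input, input_fraction=INPUT_FRACTION):
    """Signal of an immunoprecipitate as a percentage of total input chromatin.

    Assumes 100% amplification efficiency (one doubling per cycle).
    """
    # Shift the input Ct to the value expected for 100% of the chromatin.
    adjusted_input_ct = ct_input - log2(1 / input_fraction)
    return 100 * 2 ** (adjusted_input_ct - ct_ip)


def fold_enrichment(ct_antibody, ct_igg, ct_input):
    """Antibody percent-input divided by species-matched IgG percent-input."""
    return percent_input(ct_antibody, ct_input) / percent_input(ct_igg, ct_input)


# Hypothetical Ct values, NOT data from this study.
ct_foxa1, ct_igg, ct_input = 27.0, 30.0, 25.0
print(f"FOXA1 percent input: {percent_input(ct_foxa1, ct_input):.3f}%")
print(f"IgG percent input:   {percent_input(ct_igg, ct_input):.3f}%")
print(f"fold enrichment:     {fold_enrichment(ct_foxa1, ct_igg, ct_input):.1f}")
```

Because the input term cancels in the ratio, the fold enrichment under these assumptions depends only on the Ct difference between the IgG and antibody samples.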

Effect of glucose on Cdc123 and Camk1d transcript level

To measure effects of glucose on expression of Cdc123 and
Camk1d, 832/13 cells and MIN6 cells were washed with PBS and
preincubated for 2.0 h in secretion buffer (114 mM NaCl, 4.7 mM
KCl, 1.2 mM KH2PO4, 1.16 mM MgSO4, 20 mM HEPES,
2.5 mM CaCl2, 0.2% BSA, pH 7.2). For GSIS, cells were
incubated in secretion buffer for an additional 2 hours or 16 hours 
in the presence of 3 mM or 20 mM glucose and then harvested for 
RNA. 

RNA isolation and quantitative real-time reverse- 
transcription PCR 

Total cytosolic RNA was isolated using the RNeasy Mini Kit 
(Qiagen). RNA concentrations were determined using a Nanodrop 
1000 (Thermo Scientific, Wilmington, DE, USA). For real-time 
reverse transcription (RT)-PCR, first-strand cDNA was synthesized
using 8 µL of total RNA in a 20 µL reverse transcriptase
reaction mixture (SuperScript III First-Strand Synthesis kit; Life
Technologies). cDNA was diluted to the equivalent of
20-55 ng/µL input RNA. To measure total human mRNA levels of
CDC123, CAMK1D and B2M, gene-specific primers and fast 
SYBR Green Master Mix (Life Technologies) were used (Table 
S2). TaqMan designed gene expression assays (Life Technologies) 
were used to measure Cdc123, Camk1d and Rsp9 (housekeeping
gene) mRNA levels in mouse and rat cells. All PCR reactions were
performed in triplicate in a 10-µL volume using a StepOnePlus
real-time PCR system (Life Technologies). Serial 3-fold dilutions of 
cDNA from pooled human tissues, 832/13 or MIN6 cells as 
appropriate were used as a reference for a standard curve. 
Statistical significance was determined by two-tailed t-tests.
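
Quantification against a standard curve built from serial 3-fold dilutions can be illustrated as follows. This Python sketch is hypothetical throughout: the Ct values are idealized (perfect doubling per cycle), the function names are ours, and dividing the target quantity by the housekeeping-gene quantity is assumed to be how normalization was carried out.

```python
from math import log10, log2


def fit_standard_curve(relative_quantities, cts):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [log10(q) for q in relative_quantities]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(cts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantity_from_ct(ct, slope, intercept):
    """Interpolate a relative quantity from a Ct using the fitted curve."""
    return 10 ** ((ct - intercept) / slope)


def efficiency(slope):
    """Amplification efficiency implied by the slope (1.0 means 100%)."""
    return 10 ** (-1 / slope) - 1


# Five serial 3-fold dilutions of the pooled reference cDNA (relative units).
dilutions = [1 / 3 ** i for i in range(5)]
# Idealized Cts: each 3-fold dilution adds log2(3) cycles to a starting Ct of 20.
standard_cts = [20 + i * log2(3) for i in range(5)]

slope, intercept = fit_standard_curve(dilutions, standard_cts)

# Hypothetical sample Cts for a target gene and a housekeeping gene.
target_qty = quantity_from_ct(22.0, slope, intercept)
housekeeping_qty = quantity_from_ct(21.0, slope, intercept)
normalized = target_qty / housekeeping_qty

print(f"slope = {slope:.3f}, efficiency = {efficiency(slope):.2%}")
print(f"normalized expression = {normalized:.3f}")
```

In practice a separate curve is usually fitted for each primer pair or assay, since amplification efficiencies can differ between targets.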

Supporting Information 

Figure S1 Regulatory potential at rs11257655 and rs36062557.
UCSC genome browser (hg18) diagram showing that rs11257655
and rs36062557 overlap regions of open chromatin, detected by
DNase hypersensitivity and FAIRE, and histone modifications,
including H3K4me1 and H3K9ac in islet, liver, and HepG2 cells.
H3K27ac and H3K4me3 histone modifications are also shown.
rs11257655 and rs36062557 are also located near HepG2
ChIP-seq peaks for FOXA1 and FOXA2. DNA sequences
amplified to evaluate transcriptional activity in dual-luciferase 
reporter assays and to evaluate enrichment of binding to FOXA1 
and FOXA2 are indicated. 
(TIF) 

Figure S2 Transcriptional activity at rs34428576. Enhancer 
activity was measured in 832/13 cells (A) and HepG2 cells (B) for 
alleles of rs34428576. No difference was observed between alleles 
in 832/13 cells. In HepG2 cells, moderate allele-specific activity
was observed only in the reverse orientation. Error bars represent
standard deviation of 4-5 independent clones for each allele.
Results are expressed as fold change compared to empty vector 
control. P values were calculated by a two-sided t-test. 
(TIF) 

Figure S3 Chromosome 10 region not overlapping open 
chromatin does not show binding to FOXA1 and FOXA2 in 
human islets. A negative control region 28 kb downstream of 
rs11257655 was not substantially enriched in FOXA1- (A) and
FOXA2- (B) bound chromatin. Error bars represent standard 
error of two to three islets for each represented genotype. 
(TIF) 

Figure S4 CDC123 and CAMK1D expression and response to 
glucose. (A, B) Evidence that CAMK1D and CDC123 are 
expressed in various human tissues. cDNA from human islets, 
hepatocytes, blood and adipocytes was analyzed by real-time PCR 
using gene-specific primers for CAMK1D (A) and CDC123 and 
B2M (B). mRNA level was normalized to B2M. (C, D, E) Effect of 
glucose stimulus on CAMK1D and CDC123 expression level. 
832/13 and MIN6 insulinoma cells were treated with low (3 mM)
and high (15 mM) glucose for 16-18 hours. cDNA was analyzed 
by real-time PCR using TaqMan gene expression assays for 
CAMK1D (C) and CDC123 (D, E). mRNA level was normalized 
to RSP9. High glucose treatment resulted in a significant increase 
in CAMK1D mRNA level (C) but not CDC123 in MIN6 cells (D). 
High glucose treatment resulted in increased CDC123 mRNA level
in 832/13 cells. Error bars represent the standard deviation of 4-5
samples for each treatment. P values were calculated by a
two-sided t-test.
(TIF) 